Skyline Queries over Incomplete Data - Cost Models

Authors

  • Christoph Lofi
  • Kinda El Maarry
  • Wolf-Tilo Balke
Abstract

Skyline queries are a well-known technique for explorative retrieval, multi-objective optimization problems, and personalization tasks in databases. They are widely acclaimed for their intuitive query formulation mechanisms. However, when operating on incomplete datasets, skyline query processing is severely hampered and often has to resort to error-prone heuristics. Unfortunately, incomplete datasets are a frequent phenomenon due to the widespread use of automated information extraction and aggregation. In this paper, we evaluate and compare various established heuristics for adapting skylines to incomplete datasets, focusing specifically on the error they impose on the skyline result. Building upon these results, we argue for improving the skyline result quality by employing crowd-enabled databases. This allows dynamic outsourcing of some database operators to human workers, thereby enabling the elicitation of missing values during runtime. Unfortunately, each crowd-sourcing operation incurs monetary and query runtime costs. Therefore, our main contribution is a sophisticated error model that allows us to concentrate specifically on those tuples that are highly likely to be error-prone, while relying on established heuristics for safer tuples. This technique of focused crowd-sourcing allows us to strike a perfect balance between cost and result quality.
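
To make the problem concrete, the following is a minimal sketch of a skyline computation that tolerates missing values. It is not the authors' method or error model. It assumes that larger values are better in every dimension, that missing values are represented as None, and that two tuples are compared only on the dimensions where both values are known, which is one common style of heuristic for incomplete data. The names dominates and skyline are illustrative.

```python
from typing import List, Optional, Sequence

Point = Sequence[Optional[float]]  # None marks a missing value (assumption)


def dominates(a: Point, b: Point) -> bool:
    """Return True if a dominates b on the dimensions known in both tuples.

    Assumption: larger values are better in every dimension.
    Tuples with no shared known dimension are treated as incomparable.
    """
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return False
    at_least_as_good = all(x >= y for x, y in shared)
    strictly_better = any(x > y for x, y in shared)
    return at_least_as_good and strictly_better


def skyline(points: List[Point]) -> List[Point]:
    """Naive quadratic skyline: keep every tuple that no other tuple dominates."""
    return [
        p for p in points
        if not any(dominates(q, p) for q in points if q is not p)
    ]


complete = [(5, 3), (4, 4), (3, 2)]
incomplete = complete + [(None, 5), (6, None)]

print(skyline(complete))    # [(5, 3), (4, 4)]
print(skyline(incomplete))  # [(None, 5), (6, None)]
```

In this toy input, the two incomplete tuples eliminate every complete tuple, although their missing values might be arbitrarily poor. This is the kind of heuristic-induced error the abstract refers to, and it suggests why eliciting only the most doubtful missing values could pay off.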

Similar resources

Processing Skyline Queries in Incomplete Database: Issues, Challenges and Future Trends

Corresponding Author: Ali A. Alwan, Department of Computer Science, Kulliyyah of Information and Communication Technology, International Islamic University Malaysia, Kuala Lumpur 53100, Malaysia. Abstract: In many contemporary database applications such as multi-criteria decision-making and real-time decision-support applications, data mining, e-commerce and recommendati...

Efficiently Evaluating Skyline Queries on RDF Databases

Skyline queries are a class of preference queries that compute the Pareto-optimal tuples from a set of tuples and are valuable for multi-criteria decision-making scenarios. While this problem has received significant attention in the context of a single relational table, skyline queries over joins of multiple tables, which are typical of storage models for RDF data, have received much less attention....

Skyline Queries over Incomplete Multidimensional Database

In recent years, there has been much focus on skyline queries that incorporate and provide more flexible query operators that return data items which are dominating other data items in all attributes (dimensions). Several techniques for skyline have been proposed in the literature. Most of the existing skyline techniques aimed to find the skyline query results by supposing that the values of di...

Finding Skylines for Incomplete Data

In the last decade, skyline queries have been extensively studied for different domains because of their wide applications in multi-criteria decision making and search space pruning. A skyline query returns all the interesting points in a multi-dimensional data set that are not dominated by any other point with respect to all dimensions. However, real-world data sets are seldom complete, i.e. d...

A Model for Processing Skyline Queries in Crowd-sourced Databases

Received Jan 15, 2018; revised Mar 29, 2018; accepted Apr 11, 2018. Nowadays, in most modern database applications, many critical queries and tasks cannot be completely addressed by machines. Crowdsourcing databases have become a new paradigm for harnessing human cognitive abilities to process these computer-hard tasks. In particular, those problems that are difficult for machines but easier f...

Publication date: 2013